Unsupervised learning of regression mixture models with unknown number of components
نویسنده
چکیده
Regression mixture models are widely studied in statistics, machine learning and data analysis. Fitting regression mixtures is challenging and is usually performed by maximum likelihood by using the expectation-maximization (EM) algorithm. However, it is well-known that the initialization is crucial for EM. If the initialization is inappropriately performed, the EM algorithm may lead to unsatisfactory results. The EM algorithm also requires the number of clusters to be given a priori; the problem of selecting the number of mixture components requires using model selection criteria to choose one from a set of pre-estimated candidate models. We propose a new fully unsupervised algorithm to learn regression mixture models with unknown number of components. The developed unsupervised learning approach consists in a penalized maximum likelihood estimation carried out by a robust expectation-maximization (EM) algorithm for fitting polynomial, spline and B-spline regressions mixtures. The proposed learning approach is fully unsupervised: 1) it simultaneously infers the model parameters and the optimal number of the regression mixture components from the data as the learning proceeds, rather than in a two-fold scheme as in standard model-based clustering using afterward model selection criteria, and 2) it does not require accurate initialization unlike the standard EM for regression mixtures. The developed approach is applied to curve clustering problems. Numerical experiments on simulated data show that the proposed robust EM algorithm performs well and provides accurate results in terms of robustness with regard initialization and retrieving the optimal partition with the actual number of clusters. An application to real data in the framework of functional data clustering, confirms the benefit of the proposed approach for practical applications.
منابع مشابه
Model Selection for Mixture Models Using Perfect Sample
We have considered a perfect sample method for model selection of finite mixture models with either known (fixed) or unknown number of components which can be applied in the most general setting with assumptions on the relation between the rival models and the true distribution. It is, both, one or neither to be well-specified or mis-specified, they may be nested or non-nested. We consider mixt...
متن کاملFully Nonparametric Probability Density Function Estimation with Finite Gaussian Mixture Models
Flexible and reliable probability density estimation is fundamental in unsupervised learning and classification. Finite Gaussian mixture models are commonly used to serve this purpose. However, they fail to estimate unknown probability density functions when used for nonparametric probability density estimation, as severe numerical difficulties may occur when the number of components increases....
متن کاملUnsupervised Learning of Gamma Mixture Models Using Minimum Message Length
Mixture modelling or unsupervised classification is a problem of identifying and modelling components in a body of data. Earlier work in mixture modelling using Minimum Message Length (MML) includes the multinomial and Gaussian distributions (Wallace and Boulton, 1968), the von Mises circular and Poisson distributions (Wallace and Dowe, 1994, 2000) and the distribution (Agusta and Dowe, 2002a, ...
متن کاملAn Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...
متن کاملDiscriminative Mixture-of-Templates for Viewpoint Classification
Object viewpoint classification aims at predicting an approximate 3D pose of objects in a scene and is receiving increasing attention. State-of-the-art approaches to viewpoint classification use generative models to capture relations between object parts. In this work we propose to use a mixture of holistic templates (e.g. HOG) and discriminative learning for joint viewpoint classification and ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1409.6981 شماره
صفحات -
تاریخ انتشار 2014